Abstract

We investigate the potential of the Compact Muon Solenoid (CMS) detector at the 
Large Hadron Collider (LHC) to discriminate between two theoretical models predicting
anomalous events with jets and large missing transverse energy: minimal supersymmetry
and Little Higgs with T parity. We focus on a simple test case scenario,
in which the only exotic particles produced at the LHC are heavy color-triplet states 
(squarks or T-quarks), and the only open decay channel for these particles is into the 
stable missing-energy particle (neutralino or heavy photon) plus a quark. We find that 
in this scenario, the angular and momentum distributions of the observed jets are suf- 
ficient to discriminate between the two models with a few inverse fb of the LHC data, 
provided that these distributions for both models and the dominant Standard Model 
backgrounds can be reliably predicted by Monte Carlo simulations. 



1 Introduction 



Theoretical arguments strongly indicate that the Standard Model (SM) picture of elec- 
troweak symmetry breaking is incomplete, and numerous extensions of the SM at the elec- 
troweak scale have been proposed. It is expected that at least some of the new particles and 
interactions predicted by such extended theories will be discovered and studied at the LHC. 
The ultimate goal of the experiments is, of course, to determine the correct theory of physics 
at the TeV scale. This task may be quite complicated. In particular, it is quite likely that 
nature is described by one of the several models possessing the following features: 

• Physics at the TeV scale is weakly coupled, and there is a light Higgs (as motivated 
by precision electroweak data); 

• A number of new states are present at the TeV scale, and new particles can be paired 
up with the known SM states, with states in the same pair carrying identical gauge 
charges; 

• New states carry a parity quantum number distinct from their SM counterparts, im- 
plying that the lightest new particle (LNP) is stable; 

• The LNP is weakly interacting (as motivated by cosmological constraints on stable 
particles). 

The best known model of this class is the minimal supersymmetric standard model 
(MSSM). Other contenders include models with universal extra dimensions (UED) and Lit- 
tle Higgs models with T parity. Broadly speaking, all these theories share the same LHC 
phenomenology: the new physics production is dominated by the colored states, which are 
pair-produced, and then decay down to the LNP and SM states. The interesting final states 
then involve jets in association with missing transverse energy and possibly leptons and pho- 
tons. Only by studying the detailed properties of these objects can one hope to discriminate 
among the models. 

The most convincing way to discriminate between supersymmetry and its competitors 
is to measure the spin of the new particles: in the Little Higgs and UED models the new 
states and their SM partners have the same spin$^1$, while in SUSY models their spins differ
by 1/2. Measuring spin at the LHC, however, is notoriously difficult. Almost all existing
proposals rely on the observation that, if the produced strongly interacting state decays via
a cascade chain, angular correlations between the particles emitted in subsequent steps in
the cascade carry information about spin. (See Refs. [IJEIEJIIJEJEJITIIB], as well as a
recent review.) The availability of cascade decays with the right properties for this to
work, however, depends on the spectrum and couplings of the model, and is by no means 
guaranteed. Moreover, a large amount of data is typically needed to alleviate combinatoric 
and other backgrounds. 

1 Robust discrimination between the Little Higgs and UED would require observing or ruling out the 
excited level-2 and higher KK excitations of the UED model, absent in the Little Higgs. 
If unambiguous spin measurements are unavailable, the experiments can still attempt a
more modest task of model discrimination, i.e. determining which of two or more specific 
theoretical models provides a better fit to the available data. Unlike a direct spin measure- 
ment, which would rule out the entire class of models with the wrong spin assignment, this 
approach can only discriminate between specific models. For example, if it is found that the 
minimal Littlest Higgs with T-parity (LHT) model cannot fit the data, it does not exclude 
the possibility that another model of the Little Higgs class could provide a better fit. Still, 
this approach can provide valuable information, and is well worth pursuing at the LHC, 
especially at the early stages. 

The goal of our study is to estimate the prospects for model discrimination with the 
CMS detector. As a test case, we consider a very simple scenario: We assume that the only 
new physics process observable at the LHC is pair-production of new color-triplet particles, 
followed by their decay into a quark and an LNP. This process occurs at the LHC at a 
significant rate over large parts of the parameter space of the MSSM, LHT, and UED models. 
Its detector signature is two hard jets (plus possibly additional jets from gluon radiation and 
showering) and missing transverse energy. Our assumption that no other signatures are 
observed allows us to focus on this channel alone, and to understand in detail the issues 
important for model discrimination. It would be straightforward to repeat our exercise with 
more complicated models for the exotic particle production (e.g. including color octet pair- 
production channels) and decay (e.g. including cascade chains involving leptons and/or weak 
bosons). 

The main motivation for our study comes from the work of Barr [10] , who showed that 
the angular distributions of leptons from the decay of lepton partners produced directly 
(via electroweak processes) in hadron collisions carry information about the lepton partner 
spin. This example is particularly simple since the lepton partners are always produced in 
s-channel quark collisions, and their angular distribution in the production frame is almost 
unambiguously determined by their spin. However, its utility is somewhat limited by the 
small cross section of the direct lepton partner process. For quark partners, the cross sections 
are larger, but the production mechanism is more complicated: both quark-initiated and 
gluon-initiated processes need to be included, and in both cases there are both s-channel 
and t-channel diagrams. Still, as we will show, the model-dependence of the matrix elements 
can be sufficiently strong to yield observable differences between the models. Crucially, the 
differences cannot be removed by simply varying the free parameters of the model with 
the wrong spin assignments: to demonstrate this, we performed a scan over the parameter 
space of the "untrue" model. (A recent model-discrimination study by Hubisz et al.,
which studied a situation similar to our test case, also noted the significant differences in jet
distributions between models with different spins, but was restricted to a single benchmark
point in each model's parameter space. The importance of scanning over parameters in
model-discrimination studies has been recently emphasized in Refs. [8, 12].)

The rest of the paper is organized as follows: In section 2, we give a description of
the minimally supersymmetric and Little Higgs models used in our study, including input
parameters and particle spectra as well as the relevant production processes and decay
chains. We discuss the dominant standard model backgrounds and the selection cuts that
were applied to isolate signal events. We then define our observables and describe the
statistical analysis. Section 3 contains the results of our model scan, including exclusion
plots for 200 pb$^{-1}$, 500 pb$^{-1}$, 1 fb$^{-1}$ and 2 fb$^{-1}$ of integrated luminosity. We summarize our
conclusions in section 4. Finally, the appendix includes a list of formulas for error estimates,
a description of our method to calculate covariances, as well as an example of the angular
distribution of jets at an excluded LHT model point.

Table 1: Cross sections of squark pair-production processes at the LHC at the study point
of Eq. (1). Here $q = u, d, s, c$. Factorization and renormalization scales are set to 500 GeV.
The CTEQ6L1 parton distribution functions are used. The matrix elements are evaluated
at tree level using MadGraph/MadEvent, and no K-factors are applied.

2 Setup 

We will focus on the discrimination between the minimal versions of the MSSM and the 
Littlest Higgs with T-parity (LHT) [13, 14, 15, 16]. Both models have been extensively
studied in the literature; for reviews, see Refs. [17, 18]. Each model contains color-triplet
massive partners for each SM quark: squarks in the MSSM and T-odd quarks, or TOQs, 
in the LHT. Also, each model contains a stable weakly-interacting particle: the neutralino 
of the MSSM and the "heavy photon" (the T-odd partner of the hypercharge gauge boson) 
of the LHT. We will assume that these are the only particles that play a role in the LHC 
phenomenology; the rest of the new states in each model are too heavy to be produced. 
Note that the two minimal models have important differences in their particle content: for 
example, the minimal LHT does not have a color-octet heavy particle, a counterpart of the 
gluino; while the MSSM does not have a T-even partner of the top quark present in the 
LHT [19, 20]. In our scenario, however, neither of these particles is observed. This null
result does not help with model discrimination, since we do not know whether the particles
do not exist or are simply beyond the LHC reach. Model discrimination must rely on the
observed properties of the produced exotic particles or their decay products. 

Our strategy is to simulate a large sample of events corresponding to one of the models 
(we will choose the MSSM) with fixed parameters, and treat this sample as "data". The 
question is then, how well can this data be fitted with the alternative model, in this case the 
LHT? It should be emphasized that the predictions of the LHT model are not unique, but 
depend on the LHT parameters. So, when fitting data, one should look for the point in the 
LHT parameter space that provides the best fit. The LHT can be said to be disfavored by
data only to the extent that this best-fit point is disfavored. 



2.1 Data



For our case study, we assume that the MSSM is the correct underlying theory, with the 
following parameters: 



All parameters are defined at the weak scale, and no unification or other high-scale inputs are 
assumed. The parameter choices are driven by the desire to study a point with very simple 
collider phenomenology: the new physics production at the LHC is completely dominated 
by pair-production of the first two generations of squarks. The total squark pair-production 
cross section is 5.0 pb. Table 1 lists the 22 leading squark pair-production processes, which
together account for over 98% of the total. Associated squark-gluino production is strongly
suppressed by the high gluino mass; the cross section (summed over squark flavors) is only 11
fb. Associated squark-neutralino production is larger, with a total cross section of about
290 fb. However, these events have only a single hard jet, and will not pass the analysis cuts
(see section 2.4). Production of third generation squarks is also strongly suppressed, with a
total cross section of only 17 fb. Thus, in our analysis we will simulate the processes listed
in Table 1 and ignore all other SUSY production channels.

Another simplification that occurs at the chosen parameter point is in the decay pattern 
of the produced squarks: they decay into quarks and the lightest neutralino (essentially a 
bino) with a 100% probability. This means that in this model, the only place where strong 
evidence for new physics would show up at the LHC is the two jets+missing energy channel. 
We will limit our study to this channel. 

The "data" event sample has been generated in the following way. First, we simulate a 
sample of parton-level events using the MadGraph/MadEvent package [21]. The production 
processes included in this simulation, and their cross sections, are listed in Table 1. The
squark decays are also handled by MadGraph/MadEvent, using the narrow-width approximation.
The sample size corresponds to 10 fb$^{-1}$ of integrated luminosity at the LHC. The
resulting events are stored in a format compatible with the Les Houches accord, and then 
passed on to PYTHIA [22] to simulate showering and hadronization. The PYTHIA output is 
then passed on to the detector simulation code. We use a modified version of the PGS code 
to perform fast (parametrized) detector simulation. The drastic speed-up of the event sim- 
ulation provided by PGS (compared to full CMS detector simulation) allows us to scan the 
LHT parameter space, generating a statistically significant event sample for each point in 
the scan. To calibrate PGS to the CMS detector, we have generated two calibration event
samples (one in the MSSM and one in the LHT) using the full CMS detector simulation, and
compared them to the PGS output for the same two underlying models. On the basis of this
comparison, we determined that the energy and angular distributions of the PGS jets are in
excellent agreement with the full CMS simulation, once the jet energy has been appropriately
corrected. This is clear from Fig. 1, which shows the jet $p_T$ and missing transverse energy
distributions in the MSSM, obtained with PGS (red histograms) and full CMS simulation
(black histograms). For jets satisfying the selection criteria of our analysis (in particular,
$p_T^{\min} = 100$ GeV), the correction factor is essentially the same as the one appearing in
translation from parton-level jet energy to the energy reconstructed by the detector [25] (i.e., the
PGS output in this $p_T$ range essentially corresponds to parton-level jets). We have applied
this correction factor to the PGS output throughout our analysis.

Figure 1: Jet $p_T$ and missing transverse energy distributions in the MSSM, obtained with
PGS (red histograms) and full CMS simulation (black histograms). Left panel: Uncorrected
PGS. Right panel: A jet energy scale correction factor has been applied to the PGS output.

2.2 Little Higgs Model 

If the only evidence for new physics is in the two jets + missing energy channel, it is natural
to try to fit the data with the LHT model, assuming the dominant production process

$pp \to U_i \bar{U}_i$, (2)

where $U_i$ is the TOQ of flavor $i$. We will assume that four flavors of TOQs, $i = u, d, s, c$,
are degenerate at mass $M_Q$ and are within the reach of the LHC, with the other two flavors
being too heavy to play a role. Once produced, TOQs promptly decay via

$U_i \to q_i B'$, $\bar{U}_i \to \bar{q}_i B'$, (3)

giving a 2 jets + MET signature. Here $B'$ is the lightest T-odd particle (LTP), the heavy
photon of mass $M_B$. The LHT predictions in this channel are sensitive to only two model
parameters, $M_Q$ and $M_B$, which allows us to scan the parameter space with realistic
computing resources. The counterpart of the process (2), (3) in the $p\bar{p}$ collisions at the Tevatron
was considered in Ref. [23]. The Tevatron experiments exclude a region in the $M_Q$-$M_B$
plane: roughly, they place a lower bound of $M_Q \gtrsim 350$ GeV for light $B'$, $M_Q - M_B \gtrsim 250$
GeV, and somewhat weaker bounds for heavier $B'$. (There is no bound if $M_Q - M_B \lesssim 50$
GeV.)

Table 2: Signal and background cross sections (in pb), where $\sigma_n$ denotes the cross section
after cuts 1 to $n$ (see text for description). Also listed are the total number of events
simulated for our study.

To assess how well the data can be fitted with the LHT model, we perform a scan in
the $(M_Q, M_B)$ plane. We have picked 125 points in the LHT parameter space, uniformly
scanning in the ranges

$M_Q = [500, 950]$ GeV,

$M_B = [100~{\rm GeV}, M_Q]$. (4)

For each point in the scan, we generate an event sample using the procedure outlined in
section 2.1 above. Each sample corresponds to 10 fb$^{-1}$ of integrated luminosity at the LHC.
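As an illustration of such a scan, the sketch below builds a uniform grid over the ranges of Eq. (4). The 50 GeV spacing and the strict requirement $M_B < M_Q$ are assumptions made here for illustration; they happen to give 125 points, but the actual spacing used in the study is not stated in the text. The function name `lht_scan_grid` is hypothetical.

```python
def lht_scan_grid(mq_min=500, mq_max=950, mb_min=100, step=50):
    """Return (M_Q, M_B) pairs in GeV on a uniform grid with M_B < M_Q.

    The step size and the strict inequality are illustrative assumptions.
    """
    points = []
    for mq in range(mq_min, mq_max + 1, step):
        # range() excludes its upper bound, so M_B stays strictly below M_Q
        # (the LTP must be lighter than the TOQ for the decay to be open).
        for mb in range(mb_min, mq, step):
            points.append((mq, mb))
    return points


grid = lht_scan_grid()
print(len(grid), grid[0], grid[-1])
```

Each grid point would then be passed through the same event-generation chain as the "data" sample.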

2.3 Backgrounds 

Several Standard Model processes contribute to the jets + missing energy final state. The 
following background processes are dominant: 
• Z+2 jets, with Z decaying invisibly (irreducible background);

• W+2 jets, with W decaying leptonically and the charged lepton misidentified or
undetected;

• W+1 jet, with W decaying to $\tau\nu_\tau$, the $\tau$ decaying hadronically and misidentified as a
jet;

• $t\bar{t}$, with at least one of the top quarks decaying leptonically and the charged lepton(s)
misidentified or undetected.

The cross sections for each process are listed in Table 2. (For the Z/W+jets channels,
we list the parton-level Z/W + 2 jets cross sections with $p_T^{j} > 100$ GeV.) We simulated
two independent Monte Carlo samples for each process. One of the samples is mixed with
the SUSY events to obtain the "data" sample, while the other one is mixed with the LHT
events and used to fit the data. The size of each sample corresponds to 2 fb$^{-1}$ of LHC
data. All samples have been simulated following the same simulation path as for the signal: 
parton-level simulation with MadGraph/MadEvent, followed by showering and hadronization 
simulation with PYTHIA and a parametrized detector simulation with the modified PGS. It 
should be kept in mind that some of the CMS detector performance parameters which affect 
the background rates, such as lepton misidentification probabilities, may not be realistically 
modeled by PGS. In principle one could normalize these parameters using full CMS detector 
simulation, as we did for the jet-energy corrections. However, given the preliminary nature 
of our study, we did not attempt such normalization. 

In addition to the processes listed above, pure QCD multi-jet events with mismeasured 
jets leading to apparent missing energy are expected to make an important contribution to 
the background. However, until the detector is calibrated with real data, it is difficult to 
predict this background. We have not included it in this preliminary analysis. 

2.4 Triggers and Selection Cuts 

Throughout the analysis, we impose the following cuts: 

1. At least two reconstructed jets in the event

2. $p_T(j_1) > 150$ GeV

3. $p_T(j_2) > 100$ GeV

4. $\eta(j_1) < 1.7$

5. $\eta(j_2) < 1.7$

6. No identified leptons ($e$, $\mu$ or $\tau$) in the event

7. $\not\!\!E_T > 300$ GeV

where the jets are labeled according to their $p_T$, in descending order. We do not impose
any explicit cuts on jet separation, since jet reconstruction in PGS effectively acts as a
minimum separation cut. The LHC data samples will correspond to certain trigger paths, in
our case to the $\not\!\!E_T$ trigger and to jet triggers. Using simple parametrizations for the trigger
efficiencies [21] we expect them to be essentially 100% efficient, given our selection cuts.
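The seven cuts can be written as a single event filter. The sketch below is only illustrative: the event layout (jets as `(pt, eta)` tuples, a list of identified leptons, a scalar missing $E_T$) and the function name `passes_selection` are hypothetical, and the pseudorapidity cuts are interpreted here as cuts on $|\eta|$, which is an assumption.

```python
def passes_selection(jets, leptons, met):
    """Apply the seven selection cuts to one event.

    jets:    list of (pt, eta) tuples, pt in GeV (hypothetical layout)
    leptons: list of identified leptons (e, mu or tau); any entry vetoes
    met:     missing transverse energy in GeV
    """
    if len(jets) < 2:                               # cut 1
        return False
    # Label jets by descending pT, as in the text.
    ordered = sorted(jets, key=lambda j: j[0], reverse=True)
    (pt1, eta1), (pt2, eta2) = ordered[0], ordered[1]
    if pt1 <= 150.0:                                # cut 2
        return False
    if pt2 <= 100.0:                                # cut 3
        return False
    # Cuts 4 and 5: applied to |eta| here (an assumption of this sketch).
    if abs(eta1) >= 1.7 or abs(eta2) >= 1.7:
        return False
    if leptons:                                     # cut 6
        return False
    return met > 300.0                              # cut 7


print(passes_selection([(200.0, 0.5), (120.0, -1.0)], [], 350.0))
```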

The signal and background cross sections passing each of the selection cuts are listed in
Table 2. After the cuts are applied, we obtain

$S/B = 1.0$, (5)

for the SUSY signal. The S/B value is not as good as those obtained in some existing
studies of SUSY search prospects (see, for example, Refs. [27J CI]). The reason is that in
those analyses gluinos are assumed to be light, around 500 GeV, which greatly increases the
signal cross section and also yields three or more hard jets in the final state in most events,
allowing further suppression of the background. Still, the relatively large new physics cross
section implies that if reasonably accurate predictions of the background rate are available,
the presence of new physics can be convincingly established. In particular, using the 10
observables listed below and the assumptions about the systematic and statistical errors
described in Appendix A, we estimate that the existence of new physics in this channel will
be established at the level of 2.5, 4.2, and 4.9 sigma, with analyzed data samples of 200 pb$^{-1}$,
500 pb$^{-1}$, and 1 fb$^{-1}$, respectively. The discovery is dominated by shape observables: if the
total rate, which may suffer from large uncertainties in the MC predictions, is removed from
the fit completely, the confidence levels are only marginally lower.

2.5 Observables

Our analysis uses the following observables:

• $\sigma_{\rm eff}$: The cross section, in pb, of events that pass the analysis cuts. Experimentally,
this quantity is inferred from the measured event rate using $N_{\rm obs} = \mathcal{L}_{\rm int}\,\sigma_{\rm eff}$, where
$N_{\rm obs}$ is the number of events passing the cuts in a sample collected with integrated
luminosity $\mathcal{L}_{\rm int}$. It is related to the total production cross section by $\sigma_{\rm eff} = \sum_i \sigma_i E_i$,
where the sum is over all processes (signal and background) which contribute to the
sample, and $\sigma_i$ and $E_i$ are the total cross section and combined trigger/cuts efficiency,
respectively, for channel $i$.

• $\langle p_T \rangle$: The average transverse momentum of all jets with $p_T > 100$ GeV in a given data
sample that pass the analysis cuts. This variable is tightly correlated with the mass
difference between the TOQ and the LTP, $M_Q - M_B$.

• $\langle |\Sigma\eta| \rangle$: The average of the absolute value of the sum of the pseudo-rapidities of the
two leading (highest-$p_T$) jets in the event.

• $\langle H_T \rangle$: The average of the scalar sum of the transverse momenta of all jets in the event
plus the missing transverse energy, $H_T = \sum_{\rm jets} p_T + \not\!\!E_T$.

• $\langle \not\!\!E_T \rangle$: The average of the missing transverse momentum in the events that pass the
selection cuts.

• Beam Line Asymmetry (BLA): This observable is defined as $(N^+ - N^-)/(N^+ + N^-)$,
where $N^+$ and $N^-$ are the numbers of events with $\eta_1\eta_2 > 0$ and $\eta_1\eta_2 < 0$, respectively.

• Directional Asymmetry (DA): The same as above, where $N^+$ ($N^-$) are now the numbers
of events where $\vec{p}_1 \cdot \vec{p}_2$ is positive (negative)$^2$. A plot showing the distribution of relative
angles between the two hardest jets can be found in Fig. 6 in the appendix.

• Transverse momentum asymmetry (PTA): The ratio $N^+/N^-$ of the number of jets
with $p_T$ larger than $\langle p_T \rangle$ and the number of jets with $p_T$ smaller than $\langle p_T \rangle$.

• Transverse momentum bin ratios: We distribute the jets in the event into three fixed
bins, depending on their transverse momentum. The first bin corresponds to 100
GeV $< p_T <$ 300 GeV ($N_1$ events), the second to 300 GeV $< p_T <$ 500 GeV ($N_2$ events),
and the third to $p_T >$ 500 GeV ($N_3$ events). We then define two bin count ratios,
$R_1 = N_2/N_1$ and $R_2 = N_3/N_1$.
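The definitions above can be collected in a short function. This is a minimal sketch, not the analysis code: the event layout (`"jets"` as `(pt, eta, phi)` tuples, `"met"` as a scalar) and all function names are hypothetical, jets are treated as massless when forming $\vec{p}_1 \cdot \vec{p}_2$, and jets sitting exactly on a bin edge are assigned to the upper bin. All of these are assumptions made for illustration.

```python
import math


def asymmetry(n_plus, n_minus):
    """(N+ - N-) / (N+ + N-)."""
    return (n_plus - n_minus) / (n_plus + n_minus)


def observables(events, lumi_pb):
    """Compute the ten observables for events that already passed the cuts.

    events:  list of {"jets": [(pt, eta, phi), ...], "met": float}, in GeV
    lumi_pb: integrated luminosity in pb^-1
    """
    jet_pts, abs_sum_eta, ht, met = [], [], [], []
    bla_plus = bla_minus = da_plus = da_minus = 0
    for ev in events:
        jets = sorted(ev["jets"], key=lambda j: j[0], reverse=True)
        (pt1, eta1, phi1), (pt2, eta2, phi2) = jets[0], jets[1]
        jet_pts.extend(j[0] for j in jets if j[0] > 100.0)
        abs_sum_eta.append(abs(eta1 + eta2))
        ht.append(sum(j[0] for j in jets) + ev["met"])
        met.append(ev["met"])
        # Beam line asymmetry: sign of eta1 * eta2.
        if eta1 * eta2 > 0:
            bla_plus += 1
        elif eta1 * eta2 < 0:
            bla_minus += 1
        # Directional asymmetry: sign of p1 . p2 for massless jets,
        # p = (pt cos(phi), pt sin(phi), pt sinh(eta)).
        dot = pt1 * pt2 * (math.cos(phi1 - phi2)
                           + math.sinh(eta1) * math.sinh(eta2))
        if dot > 0:
            da_plus += 1
        elif dot < 0:
            da_minus += 1

    def mean(xs):
        return sum(xs) / len(xs)

    mean_pt = mean(jet_pts)
    n_above = sum(1 for p in jet_pts if p > mean_pt)
    n_below = sum(1 for p in jet_pts if p < mean_pt)
    n1 = sum(1 for p in jet_pts if p < 300.0)           # 100 < pT < 300 GeV
    n2 = sum(1 for p in jet_pts if 300.0 <= p < 500.0)  # 300 <= pT < 500 GeV
    n3 = sum(1 for p in jet_pts if p >= 500.0)          # pT >= 500 GeV
    return {
        "sigma_eff": len(events) / lumi_pb,             # N_obs / L_int, in pb
        "mean_pt": mean_pt,
        "mean_abs_sum_eta": mean(abs_sum_eta),
        "mean_ht": mean(ht),
        "mean_met": mean(met),
        "bla": asymmetry(bla_plus, bla_minus),
        "da": asymmetry(da_plus, da_minus),
        "pta": n_above / n_below,
        "r1": n2 / n1,
        "r2": n3 / n1,
    }
```

The ratios divide by event or jet counts, so the sketch assumes a sample large enough that none of the denominators is zero.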

We compute the "measured" values of these observables using the "data" sample. For
each LHT point in the scan, we compute the expected central values of the observables
using the corresponding MC sample. We then use the standard $\chi^2$ technique to estimate the
quality of the fit between the expected and measured values. The observables are assumed
to be Gaussian distributed, with the variances including statistical and systematic errors
added in quadrature. The correlation matrix between observables for each LHT point is
obtained from the generated Monte Carlo sample; the details of the procedure and error
analysis are described in Appendix A. As an example, we show the correlation matrix for
our SUSY "data" sample in Table 3. The quality of the fit to data at each LHT point is
quantified by the $\chi^2$ value, which can in turn be converted into the probability that the observed
disagreement between the measured and expected values of the observables is the result of a
random fluctuation. (If this probability is close to one, the fit is perfect; if it approaches zero,
the fit is very poor.) As a sanity check to validate our statistical procedure, we simulated a
large number of independent subsamples of SUSY and SM background events, and confirmed
that the distribution of $\chi^2$ values agrees with statistical fluctuations.
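The fit quality described above can be sketched as a $\chi^2$ with a full covariance matrix, $\chi^2 = d^T V^{-1} d$ with $d$ the vector of measured minus expected values, followed by a conversion of the tail probability into an equivalent number of Gaussian standard deviations. The sketch uses only the Python standard library. The function names are hypothetical, the closed-form tail probability is valid for an even number of degrees of freedom only, and the choice of the number of degrees of freedom (for example, the ten observables) is an assumption, since the text does not specify it. The two-sided convention is chosen so that 3 sigma corresponds to 99.7%.

```python
import math
from statistics import NormalDist


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def chi2(measured, expected, cov):
    """chi^2 = d^T V^-1 d, with d = measured - expected."""
    d = [m - e for m, e in zip(measured, expected)]
    w = solve(cov, d)  # w = V^-1 d without forming the inverse explicitly
    return sum(di * wi for di, wi in zip(d, w))


def chi2_pvalue_even_dof(x, dof):
    """Upper-tail probability of chi^2 >= x; closed form for even dof."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive even dof")
    half = x / 2.0
    term = total = 1.0
    for j in range(1, dof // 2):
        term *= half / j  # (x/2)^j / j!
        total += term
    return math.exp(-half) * total


def pvalue_to_sigma(p):
    """Two-sided Gaussian equivalent: p = 0.0027 maps to about 3 sigma."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)


x = chi2([1.0, 1.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])
print(x, chi2_pvalue_even_dof(x, 2))
```

A positive correlation between two observables that deviate in the same direction lowers the $\chi^2$ relative to the uncorrelated case, which is why the off-diagonal terms matter.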

2 For a recent analysis using BLA and DA in a context similar to ours, see Ref. [55] 
Table 3: Correlation matrix of observables in the SUSY plus SM background "data" sample,
generated from 2 fb$^{-1}$ of simulated events with 50 subsamples and 10,000 iterations. A
description of the procedure used to calculate this matrix can be found in Appendix A.

3 Results 

The main results of the analysis are presented in Fig. 2, which shows the level at which the
LHT model is excluded depending on the assumed values of the parameters. For illustration
purposes, we label the exclusion contours by the number of standard deviations in a
single-variable Gaussian distribution corresponding to the same probability. With 200 pb$^{-1}$ of
accumulated data, the combined fit to the 10 observables excludes only about half of the
LHT parameter space at better than 3-sigma level, or better than 99.7% confidence level. In
the rest of the parameter space the LHT model is still consistent with data at this level, with
the best-fit point at $M_Q = 650$ GeV, $M_B = 250$ GeV showing a less than 1-sigma deviation
from the data. With more integrated luminosity and correspondingly smaller statistical
errors, however, the LHT model can no longer fit the data. For 2 fb$^{-1}$, we find that the
complete LHT parameter space in our study is excluded at a more than 3-sigma level, and
most of the parameter space is already excluded at a 5-sigma level. Thus, it appears that
in our test-case scenario, experiment can exclude the LHT interpretation of the data with a
modest integrated luminosity of only a few fb$^{-1}$.

While we include the estimates of the systematic uncertainties for all observables in our 
study, some of the observables may suffer from additional uncertainties. One example is 
the total production cross section. We assumed a 30% systematic error on the value of 
the cross section computed in the LHT model, to account for the scale uncertainty of the 
leading-order calculation, as well as pdf and luminosity uncertainties. However, other effects, 
for example the possibility that the number of degenerate TOQ flavors is different from the 
assumed value (four), the possible presence of additional TOQ decay channels, etc., could 
significantly change this observable, keeping all others intact. Thus, it is interesting to fit the 
data with the LHT model without using the cross section information at all. Interestingly,
this fit leads to exclusion of the LHT model at levels not much weaker than the original fit,
see Fig. 3. In other words, the cross section information does not seem to play a crucial role
in model discrimination: a combination of transverse-momentum and angular distributions
of the two jets is sufficient. This is certainly reassuring. We have also performed a fit without
using the average missing transverse momentum and $H_T$ observables, which may suffer from
unexpected instrumental systematics. The results are shown in the right panel of Fig. 3.
The impact of removing these observables is more significant; some parameter values in the
LHT model are now no longer excluded at the 3-sigma level. If those two observables are
not included, it would therefore be necessary to increase the integrated luminosity to arrive
at the same confidence level for the rejection of the Little Higgs hypothesis.

Figure 2: Exclusion level of the LHT hypothesis, based on the combined fit to the ten
observables discussed in the text. Top left panel: with integrated luminosity of 200 pb$^{-1}$ at
the LHC. Top right panel: same, with integrated luminosity of 500 pb$^{-1}$. Bottom left panel:
integrated luminosity of 1 fb$^{-1}$. Bottom right panel: 2 fb$^{-1}$.
Figure 3: Exclusion level of the LHT hypothesis, based on the combined fit to nine/eight of
the ten observables discussed in the text, with integrated luminosity of 2 fb$^{-1}$ at the LHC.
Omitted are the total production cross section (left panel) and missing transverse momentum
and $H_T$ (right panel).

4 Conclusions 

Using Monte Carlo samples, we determined $\chi^2$ values for fitting a SUSY + BG "data" sample
with Little Higgs model predictions, using the heavy TOQ and "heavy photon" masses as 
fit parameters and including dominant standard model backgrounds. 

With 2 fb$^{-1}$ of signal and background events, we were able to show that a combination
of ten observables encoding angular and transverse momentum distributions of the observed 
jets contains enough information to exclude the LHT model at a 3-sigma confidence level, 
provided that these distributions for both models and the dominant Standard Model back- 
grounds can be reliably predicted by Monte Carlo simulations. We found that neither the 
effective cross section, which depends on potentially unknown decay branching ratios, nor 
information about the missing energy is crucial for this method of model discrimination. 

In reality, it is likely that the LHC phenomenology is much richer than the simple scenario 
described here, involving, for example, competing SUSY production processes and complicated
decay chains. In this case, the model-discrimination analysis would involve multiple
channels, and more new particles (and hence parameters) would be required to fit. However, 
while the details are highly model-dependent, it should be conceptually straightforward to 
extend our analysis to such situations. 
A Error estimates

Our estimates for the significance level of model exclusion rely on correct evaluation of 
statistical and systematic errors. We therefore include a summary of formulas used in our 
analysis. We use three fundamentally different types of observables. The first class consists 
of averages of measured quantities like the mean jet $p_T$ and the mean $H_T$ of events. Secondly,
asymmetries in event shapes, as well as bin ratios, are obtained by counting events that do 
or do not fulfill certain conditions. Finally, the cross section is calculated from the total 
number of signal and background events after cuts. 



A.1 Mean value observables

The error in transverse momentum of individual jets is estimated using the
parameterization given in [27]:

$\sigma_{p_T} = \left( \ldots + \frac{1.25}{\sqrt{p_T^{\rm PGS}}} + 0.033 \right) p_T^{\rm meas}$,

where all momenta are in GeV, $p_T^{\rm PGS}$ is the transverse momentum obtained from PGS, and
$p_T^{\rm meas}$ is the rescaled momentum as in [25].

The average transverse momentum observable $\langle p_T \rangle$ is calculated by taking the mean of
all jets with a minimum $p_T$ of 100 GeV in the events that pass our selection cuts.

The missing transverse energy as given by PGS has to be corrected to account for the
change in jet energy scales. The modified missing transverse energy is

$\vec{\not\!\!E}_T^{\rm meas} = \vec{\not\!\!E}_T^{\rm PGS} + \sum_i \left( \vec{p}_{T,i}^{\rm PGS} - \vec{p}_{T,i}^{\rm meas} \right)$,

where the sum runs over the jets and is a vector sum in the transverse plane.

The error in the missing transverse energy $\not\!\!E_T$ is estimated as

$\sigma^2_{\not E_T} = (3.8~{\rm GeV})^2 + 0.97^2~{\rm GeV} \cdot \not\!\!E_T + (0.012\, \not\!\!E_T)^2$,

as given in [27].
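The missing-energy correction is a vector operation in the transverse plane, which a short sketch makes explicit. The function name `corrected_met` and the representation of each vector as an `(x, y)` pair are assumptions made for illustration; the jet lists are assumed to be in the same order before and after rescaling.

```python
import math


def corrected_met(met_pgs, jets_pgs, jets_meas):
    """Shift the PGS missing-E_T vector by the change in the jet momenta.

    met_pgs:   (x, y) components of the PGS missing E_T, in GeV
    jets_pgs:  list of (px, py) jet momenta as given by PGS
    jets_meas: list of (px, py) rescaled jet momenta, same order
    """
    mx, my = met_pgs
    for (px_p, py_p), (px_m, py_m) in zip(jets_pgs, jets_meas):
        # Raising a jet's momentum removes the same amount from the
        # recoil, so the missing E_T moves by (PGS - measured).
        mx += px_p - px_m
        my += py_p - py_m
    return mx, my


# One jet recoiling against 100 GeV of missing E_T, rescaled by 1.2.
mx, my = corrected_met((100.0, 0.0), [(-100.0, 0.0)], [(-120.0, 0.0)])
print(math.hypot(mx, my))
```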

The observable (Ht) is given by the scalar sum of the transverse momentum of all objects 
plus the missing energy in the event. The error of this quantity is calculated by adding the 
errors of each object and the missing energy in quadrature. 

Given a list of individual measurements of the jet pt, Pt, or Ht, the statistical error of 
the mean value is given by 

4at = V/N, 
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where V is the variance of the distribution and N is the number of entries. The systematic 
error is given by 

<st = v 2 /N, 
where v is the mean value of the distribution. 

For the average sum of leading jet pseudorapidities, the statistical error is estimated 
as above, while the systematic error is calculated from 

$$\sigma^2_{\rm syst} = \frac{w_c^2}{2N},$$

where the $\eta$ cell width $w_c$ is 0.087. 

A.2 Counting type observables 

Given two bins of events $N^+$ and $N^-$, we define the asymmetry as 

$$A = \frac{N^+ - N^-}{N^+ + N^-}.$$

We assume purely statistical errors given by 

$$\sigma_{N^\pm} = \sqrt{N^\pm},$$

which leads to an asymmetry error of 

$$\sigma_A^2 = \frac{4 N^+ N^-}{(N^+ + N^-)^3}.$$

Given the same two bins, we define the event ratio as 

$$R = \frac{N^+}{N^-}.$$

The statistical error is then given by 

$$\sigma_R^2 = \frac{N^+ N^- + (N^+)^2}{(N^-)^3}.$$
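
The two counting-type errors follow from standard error propagation with Poisson errors $\sqrt{N^\pm}$ on each bin. The short Python sketch below evaluates both; the function names and the bin contents are hypothetical and chosen only for illustration.

```python
import math

def asymmetry(n_plus, n_minus):
    """Return (A, sigma_A) for A = (N+ - N-) / (N+ + N-), assuming sigma_N = sqrt(N)."""
    total = n_plus + n_minus
    a = (n_plus - n_minus) / total
    var_a = 4.0 * n_plus * n_minus / total ** 3
    return a, math.sqrt(var_a)

def ratio(n_plus, n_minus):
    """Return (R, sigma_R) for R = N+ / N-, assuming sigma_N = sqrt(N)."""
    r = n_plus / n_minus
    var_r = (n_plus * n_minus + n_plus ** 2) / n_minus ** 3
    return r, math.sqrt(var_r)

# Hypothetical bin contents.
a, sigma_a = asymmetry(600, 400)
r, sigma_r = ratio(600, 400)
print(f"A = {a:.3f} +- {sigma_a:.3f}")
print(f"R = {r:.3f} +- {sigma_r:.3f}")
```

Both expressions depend only on the two bin counts, so no additional input is needed once the events have been sorted into bins.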



A.3 Cross Section 

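The cross section after cuts, $\sigma_{\rm eff}$, is given by 

$$\sigma_{\rm eff} = N_{\rm obs} / \mathcal{L}_{\rm int},$$

where $\mathcal{L}_{\rm int}$ is the integrated luminosity and $N_{\rm obs}$ is the observed number of events. The statistical 
error is given by 

$$\sigma_{\rm stat} = \frac{\sigma_{\rm eff}}{\sqrt{N_{\rm obs}}},$$

and the systematic error is estimated as 30 percent, 

$$\sigma_{\rm syst} = 0.3\, \sigma_{\rm eff}.$$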
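
A minimal numerical sketch of the cross-section estimate and its two errors is given below. The function name and the event count are hypothetical, and the statistical error is taken to be the Poisson error on $N_{\rm obs}$ propagated to the cross section.

```python
import math

def effective_cross_section(n_obs, lumi_fb, syst_fraction=0.3):
    """Return (sigma_eff, sigma_stat, sigma_syst) in fb.

    n_obs is the number of events after cuts and lumi_fb the integrated
    luminosity in inverse fb. The 30 percent systematic fraction is the
    value quoted in the text.
    """
    sigma_eff = n_obs / lumi_fb
    sigma_stat = sigma_eff / math.sqrt(n_obs)
    sigma_syst = syst_fraction * sigma_eff
    return sigma_eff, sigma_stat, sigma_syst

# Hypothetical count of 400 events in 2 inverse fb.
xsec, xsec_stat, xsec_syst = effective_cross_section(n_obs=400, lumi_fb=2.0)
print(f"sigma_eff = {xsec:.1f} +- {xsec_stat:.1f} (stat) +- {xsec_syst:.1f} (syst) fb")
```

For event counts of this size the 30 percent systematic error is much larger than the statistical one.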

Figure 4: Histogram of the $N_{\rm repeat}$ values of the correlation between $\langle H_T \rangle$ and $\langle \not\!\!E_T \rangle$ obtained 
by applying the bootstrapping procedure on 2 fb$^{-1}$ of SUSY plus background events, with 
$N_{\rm sub} = 20$ and $N_{\rm repeat} = 10{,}000$. The mean value of the distribution is 0.66, as given in 
Table EJ 

A.4 Covariance Matrix Estimate 

Preserving information about the expected correlations of observables can considerably increase 
or decrease $\chi^2$ values, depending on the relative signs of observed deviations from 
the expected mean values. It is therefore highly desirable to estimate the elements of the 
covariance matrix in a consistent way. 

Since it is not possible to calculate the covariances of all observables $O_i$ analytically, we 
have to rely on an estimate based on a sample of Monte Carlo simulations. 

In an ideal world, we would simulate a full sample corresponding to the desired luminosity 
at each Little Higgs model point (including Standard Model backgrounds) $N_S$ times and 
estimate the covariance matrix from 

$$V_{ab} = \left\langle \left( O_a - \langle O_a \rangle \right) \left( O_b - \langle O_b \rangle \right) \right\rangle,$$

where $\langle \cdot \rangle$ denotes the mean over the $N_S$ sets. Because of limited computing resources, this is 
not feasible, and we have to estimate the correlations from existing subsets of events for each 
data point. We use a bootstrapping procedure, where we randomly select $N_{\rm sub}$ subsamples 
from 2 fb$^{-1}$ of signal plus background events. We calculate the correlation matrix from 
those subsamples, repeat the procedure $N_{\rm repeat}$ times and then calculate the mean values of the 
correlation matrix elements, 

$$C_{ab} = \frac{1}{N_{\rm repeat}} \sum_{i=1}^{N_{\rm repeat}} \frac{V_{ab}^{(i)}}{\sigma_a^{(i)} \sigma_b^{(i)}},$$

where $V_{ab}^{(i)}$ and $\sigma_{a,b}^{(i)}$ are the covariance and standard deviations obtained from the $N_{\rm sub}$ 
subsamples in iteration $i$. Those average matrix elements are then assumed to be the correct 
correlations of the observables in the full sample. A histogram of results obtained by this 
procedure is shown in Fig. 4. 
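
The bootstrapping procedure can be sketched in a few lines of Python with NumPy. The sketch makes several assumptions that are not stated in the text: the observables are taken to be per-event means, the subsamples are formed by randomly splitting the event list into $N_{\rm sub}$ disjoint groups, and the toy events are drawn from a correlated Gaussian. The function name `bootstrap_correlation` is hypothetical.

```python
import numpy as np

def bootstrap_correlation(events, n_sub=20, n_repeat=200, seed=0):
    """Average correlation matrix of subsample means over n_repeat random splits.

    events: array of shape (n_events, n_observables) holding per-event
    quantities (for example H_T and missing E_T).
    """
    rng = np.random.default_rng(seed)
    n_events, n_obs = events.shape
    corr_sum = np.zeros((n_obs, n_obs))
    for _ in range(n_repeat):
        # Randomly split the events into n_sub disjoint subsamples (assumption).
        groups = np.array_split(rng.permutation(n_events), n_sub)
        # One value of each mean-type observable per subsample.
        means = np.array([events[idx].mean(axis=0) for idx in groups])
        # V_ab / (sigma_a sigma_b) estimated from the n_sub subsample values.
        corr_sum += np.corrcoef(means, rowvar=False)
    return corr_sum / n_repeat

# Toy events with a true correlation of 0.6 between two quantities.
toy_rng = np.random.default_rng(1)
toy_events = toy_rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=2000)
corr = bootstrap_correlation(toy_events)
print(corr)
```

Averaging over many random splits reduces the large fluctuations of a correlation coefficient estimated from only $N_{\rm sub}$ values, which is the spread visible in the histogram of individual iterations.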

Figure 5: Exclusion level of the LHT hypothesis, based on the combined fit to the ten 
observables discussed in the text at 2 fb$^{-1}$. Left: As in Fig. 2, including correlations between 
observables as determined by the bootstrapping procedure. Right: Assuming that all 
observables in the Little Higgs model are uncorrelated. 

Finally, we assume that the correlation is independent of the sample size, and extrapolate 
to find the covariances for the full set of events, 

$$V_{ab} = C_{ab}\, \sigma_a \sigma_b,$$

where the standard deviations for the two observables, $\sigma_a$ and $\sigma_b$, are now calculated from the 
full sample and include both statistical and systematic errors. 
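
The extrapolation to the full sample and the resulting $\chi^2$ can be illustrated with the following sketch. It assumes that statistical and systematic errors are combined in quadrature, which the text does not state explicitly, and it uses hypothetical function names and toy numbers (apart from the correlation of 0.66 quoted for $\langle H_T \rangle$ and $\langle \not\!\!E_T \rangle$).

```python
import numpy as np

def covariance_from_correlation(corr, sigma_stat, sigma_syst):
    """Build V_ab = C_ab * sigma_a * sigma_b from full-sample errors.

    Statistical and systematic errors are added in quadrature (assumption).
    """
    sigma = np.sqrt(np.asarray(sigma_stat) ** 2 + np.asarray(sigma_syst) ** 2)
    return np.asarray(corr) * np.outer(sigma, sigma)

def chi2(observed, predicted, cov):
    """Return d^T V^{-1} d for the deviation d = observed - predicted."""
    d = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(d @ np.linalg.solve(cov, d))

corr = np.array([[1.0, 0.66], [0.66, 1.0]])
cov = covariance_from_correlation(corr, sigma_stat=[0.6, 0.6], sigma_syst=[0.8, 0.8])

# Deviations of one standard deviation each, with equal and opposite signs.
chi2_same_sign = chi2([1.0, 1.0], [0.0, 0.0], cov)
chi2_opposite_sign = chi2([1.0, -1.0], [0.0, 0.0], cov)
chi2_uncorrelated = chi2([1.0, 1.0], [0.0, 0.0], np.diag(np.diag(cov)))
print(chi2_same_sign, chi2_opposite_sign, chi2_uncorrelated)
```

For positively correlated observables, deviations of the same sign give a smaller $\chi^2$ than in the uncorrelated case, while deviations of opposite sign give a larger one. This is the dependence on relative signs mentioned at the start of this section.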

We verified that this procedure produces the correct $\chi^2$ probability distribution function 
for the model distance between subsample and full sample observables. 

Since the selection of subsample events that have passed our cuts is randomized, no 
information about the correlation of the cross section with the other observables can be 
obtained by this method, and we assume that the cross section is uncorrelated. 

Fig. 5 illustrates the importance of including correlation information. Assuming uncorrelated 
observables, a small fraction of points in the LHT parameter space are found to be 
excluded at a higher confidence level. However, the exclusion level of the best fit point is 
lowered significantly, and so the LHT model can no longer be rejected at the 3-sigma level. 

B Angular distribution of jets 

As an example, we show the relative angular distribution of the two hardest jets in the SUSY 
"data" sample and for the Little Higgs model with ($m_Q = 500$ GeV, $m_B = 100$ GeV). The 
directional asymmetry is $-0.079 \pm 0.019$ for the SUSY "data" and $0.008 \pm 0.017$ for the LHT 
+ BG sample. Using just this observable, the $\chi^2$ between the two models is 26.19, so that 
the LHT model would be excluded at the 5-sigma level. 
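
The quoted $\chi^2$ can be checked directly from the two asymmetry values. The arithmetic below suggests that it corresponds to dividing the difference by the LHT + BG error alone; this is an inference from the quoted numbers, not something stated in the text. For comparison, the sketch also evaluates the value obtained if both errors were combined in quadrature.

```python
# Asymmetry values and errors quoted in the text.
a_susy, err_susy = -0.079, 0.019
a_lht, err_lht = 0.008, 0.017

# Using only the error of the LHT + BG prediction (assumed reading).
chi2_model_only = ((a_susy - a_lht) / err_lht) ** 2

# Alternative for comparison: both errors combined in quadrature.
chi2_combined = (a_susy - a_lht) ** 2 / (err_susy ** 2 + err_lht ** 2)

print(f"chi2 (LHT + BG error only) = {chi2_model_only:.2f}")
print(f"chi2 (errors in quadrature) = {chi2_combined:.2f}")
```

The first value reproduces 26.19, corresponding to a deviation of about 5.1 standard deviations for a single observable.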

Figure 6: Distribution of the cosine of the angle $\theta_{12}$ between the two hardest jets in the 
SUSY sample ("data" points), as well as the prediction from the LHT model (histogram) 
with parameters $m_Q = 500$ GeV, $m_B = 100$ GeV. 